Audio-Based Distributional Representations of Meaning Using a Fusion of Feature Encodings
Authors
Abstract
Recently, a “Bag-of-Audio-Words” approach was proposed [1] for combining lexical features with audio clips in a multimodal semantic representation, i.e., an Audio Distributional Semantic Model (ADSM). An important step towards the creation of ADSMs is the estimation of the semantic distance between clips in the acoustic space, which is especially challenging given the diversity of audio collections. In this work, we investigate the use of different feature encodings to address this challenge, following a two-step approach. First, an audio clip is categorized with respect to three classes, namely music, speech, and other. Next, the feature encodings are fused according to the posterior probabilities estimated in the previous step. Using a collection of audio clips annotated with tags, we derive a mapping between words and audio clips. Based on this mapping and the proposed audio semantic distance, we construct an ADSM in order to compute the distance between words (lexical semantic similarity task). The proposed model is shown to significantly outperform the state-of-the-art results reported in the literature, with a 23.6% relative improvement in correlation coefficient.
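The two-step approach described in the abstract can be illustrated with a minimal sketch. The abstract does not give the fusion formula, so everything below is an assumption made for illustration: one encoding per class for each clip, cosine distance per encoding, a weight for each class equal to the average of the two clips' posteriors, and a word distance computed as the mean distance over all clip pairs tagged with the two words. The names `fused_clip_distance` and `word_distance` are hypothetical and do not come from the paper.

```python
import math

# The three classes named in the abstract.
CLASSES = ("music", "speech", "other")


def cosine_distance(u, v):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def fused_clip_distance(enc_a, enc_b, post_a, post_b):
    """Posterior-weighted sum of per-class encoding distances.

    enc_*:  dict mapping class name -> encoding vector (assumed layout).
    post_*: dict mapping class name -> posterior probability of that class.
    ASSUMPTION: the weight of a class is the mean of the two posteriors.
    """
    total = 0.0
    for c in CLASSES:
        weight = 0.5 * (post_a[c] + post_b[c])
        total += weight * cosine_distance(enc_a[c], enc_b[c])
    return total


def word_distance(word_a, word_b, word_to_clips, clips):
    """Mean fused distance over all clip pairs tagged with the two words.

    ASSUMPTION: averaging over pairs is one simple choice, not the paper's.
    """
    dists = [
        fused_clip_distance(
            clips[i]["enc"], clips[j]["enc"], clips[i]["post"], clips[j]["post"]
        )
        for i in word_to_clips[word_a]
        for j in word_to_clips[word_b]
    ]
    return sum(dists) / len(dists)
```

Because the class posteriors of each clip sum to one, the weights also sum to one, so the fused distance stays in the same range as the per-class distances.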
Similar resources
Are Distributional Representations Ready for the Real World? Evaluating Word Vectors for Grounded Perceptual Meaning
Distributional word representation methods exploit word co-occurrences to build compact vector encodings of words. While these representations enjoy widespread use in modern natural language processing, it is unclear whether they accurately encode all necessary facets of conceptual meaning. In this paper, we evaluate how well these representations can predict perceptual and conceptual features ...
Diverse Context for Learning Word Representations
Word representations are mathematical objects that capture a word’s meaning and its grammatical properties in a way that can be read and understood by computers. Word representations map words into equivalence classes such that words that share similar properties to each other are part of the same equivalence class. Word representations are either constructed manually by humans (in the form of ...
An Improved Tabu Search Algorithm for Job Shop Scheduling Problem Trough Hybrid Solution Representations
Job shop scheduling problem (JSP) is an attractive field for researchers and production managers since it is a famous problem in many industries and a complex problem for researchers. Due to NP-hardness property of this problem, many meta-heuristics are developed to solve it. Solution representation (solution seed) is an important element for any meta-heuristic algorithm. Therefore, many resear...
Redundancy in Perceptual and Linguistic Experience: Comparing Feature-Based and Distributional Models of Semantic Representation
Since their inception, distributional models of semantics have been criticized as inadequate cognitive theories of human semantic learning and representation. A principal challenge is that the representations derived by distributional models are purely symbolic and are not grounded in perception and action; this challenge has led many to favor feature-based models of semantic representation. We...
Constructing semantic representations using the MDL principle
Words receive a significant part of their meaning from use in communicative settings. The formal mechanisms of lexical acquisition, as they apply to rich situational settings, may also be studied in the limited case of corpora of written texts. This work constitutes an approach to deriving semantic representations for lexemes using techniques from statistical induction. In particular, a number o...